-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(restore): adds --dc-mapping flag to restore command #4213
base: master
Are you sure you want to change the base?
Conversation
In terms of syntax:
I find it confusing that it's possible to specify both restored and skipped DCs in the same mapping key.
I would suggest to validate that positive and negative DC occurrences are not mixed in the mapping key and that negative DCs can't be mapped to anything, so that we avoid confusion and typing mistakes. EDIT: The same goes here:
The negative DC is placed in the mapping value alongside positive DC? But I'm still in favor of not mixing positive and negative DCs here.
On the other hand, if we allow no DCs on either side of
Alternative approach would be to add some special characters for distinguishing between src and dst DCs, but I guess that the relation to the |
@VAveryanov8 I made a general review, but I didn't dive into other details, because it would be better to first discuss the current comments (other things might not matter after that). |
Also, we should deprecate the datacenter from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VAveryanov8 Thanks for the extensive PR and for covering a lot of different scenarios, but we don't need such a complex logic.
It's enough to do the one to one mapping. Instead of allowing to merge multiple DCs in source or multiple DCs in target.
Let's discuss the goal of DC mapping on the call.
It's enough to have a possibility of:
- restoring just a single DC (that's why I don't really see the reason for introducing another flag that skips validation)
- explicitly defining that nodes from target DCX are expected to download data of source DCY and call to load & stream
pkg/service/restore/worker.go
Outdated
return nil, errors.Wrap(err, "get status") | ||
} | ||
|
||
sourceMap, targetMap := target.DCMappings.calculateMappings() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: sourceMap => sourceDC2TargetDCMap , targetMap => targetDC2SourceDCMap
pkg/command/restore/dcmappings.go
Outdated
|
||
type dcMapping struct { | ||
Source []string `json:"source"` | ||
Target []string `json:"target"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DC mapping should be single string to single string.
DC1 (of the source cluster) => DC2 (of the destination cluster) ; DC2 (of the source) => DC1 (of the target)
And that's it, no need for the complex logic.
4ed7c9d
to
d237454
Compare
@VAveryanov8 I see some recent pushes to this PR, is it ready for re-review ? |
Yes, it's ready for re-review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !
@mikliapko There are SCT tests that were trying to restore multiDC cluster, but they were failing due to the problems with encryption at rest enabled on the cluster.
Two datacenters were using different encryption keys.
This PR is expected to fix this problem. AFAIR, you changed these tests to singleDC later. Is it possible to validate this PR against the SCT tests with the multiDC restore with encryption at rest enabled ?
@VAveryanov8 Please rebase on master. You will have conflicts related to the new submodule, but they are suppose to be very easy to fix. |
Yep, we can do it, just give me the SM build that can be used |
@mikliapko Here is the manager-build triggered on the current branch https://jenkins.scylladb.com/view/scylla-manager/job/manager-master/job/manager-build/919/ You will have to edit the test to include |
f0df68a
to
9852f42
Compare
This adds support for --dc-mapping flag to restore command. It specifies mapping between DCs from the backup and DCs in the restored(target) cluster. All DCs from the source cluster should be explicitly mapped to all DCs in the target cluster. The only exception is when source and target cluster has exact match: source dcs == target dcs. Only works with tables restoration (--restore-tables=true). Syntax: "source_dc1,source_dc2=>target_dc1,target_dc2" Multiple mappings are separated by semicolons (;). Exclamation mark (!) before a DC indicates that it should be ignored during restore. Examples: "dc1,dc2=>dc3" - data from dc1 and dc2 DCs should be restored to dc3 DC. "dc1,dc2=>dc3,!dc4" - data from dc1 and dc2 DCs should be restored to dc3 DC. Ignoring dc4 DC from target cluster. "dc1,!dc2=>dc2" - data from dc1 should be restored to dc2 DC. Ignoring dc2 from source cluster. Fixes: #3829
This introduces use of dc mappings when restoring tables. Now each dc is downloading only data from corresponding dc(s) accordingly to user provided mapping. Also some dcs can be explicitly ignored. Fixes: #3829
This adds another cluster to docker setup, so we can have integration tests for dc-mappings. Fixes: #3829
This removes support for !dc syntax and introduces new --skip-dc-mapping-validation flag which can be used when partial restore is needed.
This removes third cluster, but adds another data center to the second cluster.
This introduces LocationInfo which is a direct replacement of locationHosts, but it contains more information about Location, like what DC are actually stored in this Location, what hosts can access it and the list of manifests from this location. Also LocationInfo is created with the respect of dc mappings.
This simplifies `--dc-mapping` usage only for 1 use case: 1-1 dc mapping. This means that squeezing (restore dc1 and dc2 into dc3) or extending (restore dc1 into dc1 and dc2) DCs is not supported when `--dc-mapping` is provided. So the syntax is also simplified: ``` source_dc1=>target_dc1;source_dc2=>target_dc2 Where => is used to separate source dc name and target dc name ; is used to separate multiple mappings ``` If `--dc-mapping` is not provided, then current behavior should be preserved - each node with access to DC can download it data. Also it's allowed to provide only subset of DCs, ignoring source dc or target (or both).
This removed dc3 from second cluster as `--dc-mappings` was simplified and there is no necessity in having cluster with another dc name.
As this pr does not introduce any breaking changes anymore, integration tests can be simplified as well.
This makes locationInfo.AllHosts to return only uniq hosts. Also some small fixes in tests and in error messages.
This fixes the behavior when dc-mapping is not set.
9852f42
to
99af84e
Compare
Testing with the provided build. To make it work, dc mapping should be taken in quotes (see examples below):
@VAveryanov8 @karol-kokoszka Is it inevitable in this case? |
For this particular syntax yes, quotes are required. But the good news is that I'm gonna get rid of it towards more simple syntax using just equal(=) instead of array(=>) and comma(,) instead of semi-column(;). Then I suppose it will be possible to use without quotes. |
Great, please, let me know when you will have the build with new implementation |
To run a restore task for multiDC cluster with EaR enabled (otherwise fails #1), the special flag has been introduced in Manager (#2) to map backed up DC with DC under restore, for example, sctool restore ... --dc-mapping dc1=dc1,dc2=dc2 The change introduces this new flag into `create_restore_task` method and makes sure that if Scylla cluster has more than 1 datacenter - the restore task will be triggered with this flag applied. refs: 1. scylladb/scylla-manager#3871 2. scylladb/scylla-manager#4213
This simplifies syntax of dc-mapping by leveraging pflag.StringToString functionality. Also extends integration tests with validation that each node should download only tables from corresponding DCs when dc-mapping is provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks nice!
Just some question about the comment.
@mikliapko Here is the latest build from this branch https://jenkins.scylladb.com/view/scylla-manager/job/manager-master/job/manager-build/923/ |
To run a restore task for multiDC cluster with EaR enabled (otherwise fails #1), the special flag has been introduced in Manager (#2) to map backed up DC with DC under restore, for example, sctool restore ... --dc-mapping dc1=dc1,dc2=dc2 The change introduces this new flag into `create_restore_task` method and makes sure that if Scylla cluster has more than 1 datacenter - the restore task will be triggered with this flag applied. refs: 1. scylladb/scylla-manager#3871 2. scylladb/scylla-manager#4213
Retesting here https://jenkins.scylladb.com/view/staging/job/scylla-staging/job/mikita/job/manager-master/job/ubuntu22-sanity-test/69/ |
This adds support for
--dc-mapping
flag to restore command. It specifies mapping between DCs from the backup and DCs in the restored(target) cluster. Only 1 use case is supported: 1-1 dc mapping. This means that squeezing (restore dc1 and dc2 into dc3) or extending (restore dc1 into dc1 and dc2) DCs is not supported when--dc-mapping
is provided.So the syntax is:
If
--dc-mapping
is not provided, then current behavior should be preserved - each node with access to DC can download it data. Also it's allowed to provide only subset of DCs, ignoring source dc or target (or both).Only works with tables restoration (
--restore-tables=true
).Fixes: #3829
Please make sure that: